Fuzzy–synthetic minority oversampling technique: Oversampling based on fuzzy set theory for Android malware detection in imbalanced datasets
Authors
Abstract
Similar resources
Generative Oversampling for Mining Imbalanced Datasets
One way to handle data mining problems where class prior probabilities and/or misclassification costs between classes are highly unequal is to resample the data until a new, desired class distribution in the training data is achieved. Many resampling techniques have been proposed in the past, and the relationship between resampling and cost-sensitive learning has been well studied. Surprisingly...
WEMOTE - Word Embedding based Minority Oversampling Technique for Imbalanced Emotion and Sentiment Classification
Imbalanced training data is a persistent problem for supervised emotion and sentiment classification. Several existing studies have shown that data sparseness and small disjuncts are the two major factors affecting classification. To address these two problems, this paper presents a word embedding based oversampling method. First, a large-scale text corpus is used to train a continuous ...
Oversampling Method for Imbalanced Classification
Classification on imbalanced datasets is a pervasive problem in many data mining domains. Imbalanced classification has been a hot topic in the academic community. From the data level to the algorithm level, many solutions have been proposed to tackle the problems resulting from imbalanced datasets. SMOTE is the most popular data-level method, and many derivations based on it have been developed to ...
A Synthetic Minority Oversampling Method Based on Local Densities in Low-Dimensional Space for Imbalanced Learning
Imbalanced class distribution is a challenging problem in many real-life classification tasks. Existing synthetic oversampling methods suffer from the curse of dimensionality because they rely heavily on Euclidean distance. This paper proposes a new method called Minority Oversampling Technique based on Local Densities in Low-Dimensional Space (MOT2LD for short). MOT2LD first maps each trainin...
A Classification Model for Imbalanced Medical Data based on PCA and Farther Distance based Synthetic Minority Oversampling Technique
Medical data are extensively used in diagnosing human health, so they play a vital role for physicians as well as in medical engineering. Accordingly, much research is under way to better predict diseases or to improve diagnosis quality. However, most researchers work on either dimensionality space or imbalanced data. Due to this, s...
Journal
Journal title: International Journal of Distributed Sensor Networks
Year: 2017
ISSN: 1550-1477
DOI: 10.1177/1550147717703116